What Is Where: Inferring Containment Relations from Videos
Abstract
In this paper, we present a probabilistic approach to explicitly infer containment relations between objects in 3D scenes. Given an input RGB-D video, our algorithm quantizes the perceptual space of a 3D scene by reasoning about containment relations over time. At each frame, we represent the containment relations in space by a containment graph, where each vertex represents an object and each edge represents a containment relation. We assume that human actions are the only cause of changes in containment relations over time, and classify human actions into four types of events: move-in, move-out, no-change, and paranormal-change. Here, paranormal-change refers to events that are physically infeasible and are thus ruled out through reasoning. A dynamic programming algorithm is adopted to find both the optimal sequence of containment relations across the video and the containment relation changes between adjacent frames. We evaluate the proposed method on our dataset of 1326 video clips taken in 9 indoor scenes, including some challenging cases such as heavy occlusions and diverse changes of containment relations. The experimental results demonstrate good performance on the dataset.
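The dynamic programming step described in the abstract can be illustrated with a minimal Viterbi-style sketch. This is not the authors' implementation: it tracks a single object whose state is the container it sits in (`None` means not contained), the event scores in `EVENT_LOG_SCORE` and the per-frame likelihoods are made-up numbers, and the rule that a direct jump between two containers counts as a paranormal-change is an assumption made only for this example.

```python
import math

# Hypothetical log-scores per event type (illustrative values only).
# A paranormal-change gets -inf, so it can never appear in the best path.
EVENT_LOG_SCORE = {
    "no-change": math.log(0.8),
    "move-in": math.log(0.1),
    "move-out": math.log(0.1),
    "paranormal-change": -math.inf,
}

def classify_event(prev, curr):
    """Label the change of one object's container between adjacent frames."""
    if prev == curr:
        return "no-change"
    if prev is None:
        return "move-in"
    if curr is None:
        return "move-out"
    # Assumption for this sketch: jumping straight from one container
    # to another is treated as physically infeasible.
    return "paranormal-change"

def best_sequence(states, frame_log_scores):
    """Return the highest-scoring state sequence and the events between frames.

    frame_log_scores[t][s] is the log-likelihood of state s at frame t.
    """
    best = [{s: frame_log_scores[0][s] for s in states}]
    back = []
    for t in range(1, len(frame_log_scores)):
        scores, pointers = {}, {}
        for curr in states:
            # Score of reaching `curr` from every possible previous state.
            cand = {
                prev: best[-1][prev] + EVENT_LOG_SCORE[classify_event(prev, curr)]
                for prev in states
            }
            prev_star = max(cand, key=cand.get)
            scores[curr] = cand[prev_star] + frame_log_scores[t][curr]
            pointers[curr] = prev_star
        best.append(scores)
        back.append(pointers)
    # Backtrack from the best final state.
    path = [max(best[-1], key=best[-1].get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    events = [classify_event(a, b) for a, b in zip(path, path[1:])]
    return path, events

# Made-up per-frame likelihoods for three frames.
states = [None, "box", "bag"]
frames = [
    {None: 0.01, "box": 0.98, "bag": 0.01},
    {None: 0.60, "box": 0.20, "bag": 0.20},
    {None: 0.01, "box": 0.01, "bag": 0.98},
]
frame_log_scores = [{s: math.log(p) for s, p in f.items()} for f in frames]
path, events = best_sequence(states, frame_log_scores)
print(path)    # ['box', None, 'bag']
print(events)  # ['move-out', 'move-in']
```

Because the infeasible transition is scored at negative infinity, the recovered sequence has to pass through the uncontained state, which yields a move-out followed by a move-in rather than a direct box-to-bag jump.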
Similar Resources
Inferring actor communities from videos
In recent years there has been a growing interest in inferring social relations amongst actors in a video using audiovisual features, co-appearance features, or both. The discovered relations between actors have been used for identifying leading roles, detecting rival communities in a movie plot, and so on. In this paper we propose an unsupervised method which uses the video’s transcript and closed cap...
Tracking Occluded Objects and Recovering Incomplete Trajectories by Reasoning about Containment Relations and Human Actions
This paper studies the challenging problem of tracking severely occluded objects in long video sequences. The proposed method reasons about containment relations and human actions, and thus infers and recovers the identities of occluded objects while they are contained or blocked by others. There are two conditions that lead to incomplete trajectories: i) Contained. The occlusion is caused by a containment relat...
Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks
This paper addresses a new problem: jointly inferring human attention, intentions, and tasks from videos. Given an RGB-D video where a human performs a task, we answer three questions simultaneously: 1) where the human is looking (attention prediction); 2) why the human is looking there (intention prediction); and 3) what task the human is performing (task recognition). We propose a hierarchical model...
Detecting Hierarchical Ties Using Link-Analysis Ranking at Different Levels of Time Granularity
Social networks contain implicit knowledge that can be used to infer hierarchical relations that are not explicitly present in the available data. Interaction patterns are typically affected by users’ social relations. We present an approach to inferring such information that applies a link-analysis ranking algorithm at different levels of time granularity. In addition, a voting scheme is emplo...
Learning Social Relations from Videos: Features, Models, and Analytics
Despite the progress made during the last decade in video understanding, extracting high-level semantics in the form of relations among the actors in a video is still an under-explored area. This chapter discusses a streamlined methodology to learn interactions between actors, construct social networks, identify communities, and find the leader of each community in a video sequence from a socio...
Publication date: 2016